
Bio-mathematics, Statistics and Nano-Technologies: Mosquito Control Strategies

Figure 9.5: The clustering of natural and synthesized compounds (Syn1–Syn13) with repellent activity toward A. gambiae females (Thireou et al. 2018) based on calculated physicochemical descriptors.

9.3.5.2 Principal component analysis

Along with cluster analysis, principal component analysis (PCA) is one of the most frequently used pattern recognition methods. If multicollinearity occurs among the studied variables, it is meaningful to apply PCA in order to reduce the dimensionality of the data set and define new principal variables (principal components). Score plots and loading plots are used to present the results of the analysis: the score plot reveals similarities among the samples, and the loading plot reveals similarities among the variables. Figure 9.7 presents the PCA score plot for the same set of compounds that was analyzed by HCA, on the basis of the same set of molecular descriptors. The score plot indicates significant separation of the majority of the synthesized compounds along the PC1 axis, which accounts for 47.34% of the total variance. Most of the natural repellent compounds are placed at the positive end of the PC1 axis. The distribution of the compounds along the PC2 axis, which accounts for 34.46% of the total variance, is mostly based on their lipophilicity, which has the strongest influence on PC2. The influence of the descriptors on the distribution of the compounds is determined on the basis of the loadings plot (not shown).
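The quantities discussed in this section (scores, loadings, and the share of variance carried by each principal component) can be illustrated with a minimal Python sketch. It assumes that the descriptors are autoscaled before the analysis and that PCA is computed by singular value decomposition; the source specifies neither. The descriptor matrix below is synthetic and purely hypothetical. It is not the data set of Thireou et al. (2018), and the function name pca is chosen here only for illustration.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA of an (n_samples, n_descriptors) matrix via SVD of autoscaled data.

    Assumes every descriptor column has non-zero variance.
    """
    X = np.asarray(X, dtype=float)
    # Autoscale: zero mean and unit variance for each descriptor (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Z = U * diag(s) * Vt; the rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Fraction of the total variance carried by each component.
    explained = s**2 / np.sum(s**2)
    # Scores: coordinates of the samples (compounds) on the new axes.
    scores = U[:, :n_components] * s[:n_components]
    # Loadings: weights of the original descriptors in each component.
    loadings = Vt[:n_components].T
    return scores, loadings, explained[:n_components]


# Hypothetical data: 20 "compounds" described by 5 correlated "descriptors".
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 2))    # two hidden factors
mixing = rng.normal(size=(2, 5))     # how the factors mix into descriptors
X = latent @ mixing + 0.1 * rng.normal(size=(20, 5))

scores, loadings, explained = pca(X)

print("Explained variance ratio (PC1, PC2):", np.round(explained, 4))
print("Scores shape (compounds x components):", scores.shape)
print("Loadings shape (descriptors x components):", loadings.shape)
```

Plotting the two columns of scores against each other would give a score plot of the kind described above, while plotting the two columns of loadings would give the corresponding loadings plot. The percentages quoted for PC1 and PC2 in the text correspond to the first two entries of a vector such as explained.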